Add public large-corpus scale guard #101

Merged: SonAIengine merged 3 commits into main from codex/public-scale-guard on Jul 2, 2026

@SonAIengine

Contributor

Summary

  • extend the Tier-1 benchmark runner with BEIR FiQA, TREC-COVID, and SciFact plus staged corpus limits
  • speed up SQLite batch FTS ingest by skipping FTS deletes for newly inserted nodes
  • add public scale guard workflow and scale report covering FiQA 57k and TREC-COVID 171k runs
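
To illustrate the second bullet, here is a minimal sketch of skipping FTS deletes for rows that are known to be new. It is not the project's actual code: the `nodes` and `nodes_fts` tables, the `open_store` helper, and the `ingest_batch` function are hypothetical names chosen for this example, and it assumes a Python `sqlite3` build with FTS5 enabled. The idea is that a delete against the FTS table is only needed when an id already exists, so one lookup per batch can replace a delete per row.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a node table and an FTS5 index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.execute("CREATE VIRTUAL TABLE nodes_fts USING fts5(node_id UNINDEXED, body)")
    return conn


def ingest_batch(conn: sqlite3.Connection, batch) -> int:
    """Upsert (node_id, body) pairs and return how many FTS deletes were issued."""
    rows = list(dict(batch).items())  # collapse duplicate ids, last one wins
    if not rows:
        return 0
    ids = [node_id for node_id, _ in rows]
    # One lookup per batch. Assumes the batch is smaller than SQLite's
    # bound-parameter limit; a real loader would chunk the id list.
    marks = ",".join("?" * len(ids))
    existing = {
        row[0]
        for row in conn.execute(f"SELECT id FROM nodes WHERE id IN ({marks})", ids)
    }
    stale = [(node_id,) for node_id in ids if node_id in existing]
    with conn:  # single transaction for the whole batch
        # Only ids that were already indexed need their old FTS row removed;
        # newly inserted nodes skip the delete entirely.
        conn.executemany("DELETE FROM nodes_fts WHERE node_id = ?", stale)
        conn.executemany("INSERT OR REPLACE INTO nodes (id, body) VALUES (?, ?)", rows)
        conn.executemany("INSERT INTO nodes_fts (node_id, body) VALUES (?, ?)", rows)
    return len(stale)
```

On a first bulk load every id is new, so `ingest_batch` issues no FTS deletes at all; only re-ingested ids pay for one.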

Scale results

  • FiQA (57,638 docs): build time improved from 577.4s to 58.4s; search took 1.4s over 10 queries
  • TREC-COVID staged/full smoke runs: the 50k-doc stage built in 49.3s with 1.3s search; the full 171,332-doc corpus built in 370.4s with 5.2s search
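
For readers comparing these runs, the figures above can be turned into a speedup and an ingest throughput. The sketch below uses only the numbers reported in this PR; the helper names `speedup` and `docs_per_second` are introduced here for illustration. The FiQA build works out to roughly a 9.9x speedup.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the old build time to the new build time."""
    return before_s / after_s


def docs_per_second(docs: int, build_s: float) -> float:
    """Average ingest throughput for one build."""
    return docs / build_s


# Figures copied from the scale results above.
fiqa_speedup = speedup(577.4, 58.4)
fiqa_rate = docs_per_second(57_638, 58.4)
trec_full_rate = docs_per_second(171_332, 370.4)

print(f"FiQA build speedup: {fiqa_speedup:.1f}x")
print(f"FiQA ingest rate: {fiqa_rate:.0f} docs/s")
print(f"TREC-COVID full ingest rate: {trec_full_rate:.0f} docs/s")
```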

Tests

  • uv run --extra dev ruff check
  • uv run --extra dev ruff format --check
  • uv run --extra dev pytest tests/test_tier1_benchmarks.py tests/test_backend_sqlite.py tests/test_backend_sqlite_graph.py tests/test_real_scores.py tests/test_memory_operating_layer.py::test_graph_mutations_record_memory_events tests/test_memory_operating_layer.py::test_graph_add_can_skip_memory_event_for_bulk_loads -q
  • CI-like local suite earlier in this branch: 1474 passed, 5 skipped

@SonAIengine

Contributor Author

Public Scale Guard dispatch note:

  • The GitHub API cannot trigger a workflow_dispatch run for this new workflow until the file exists on the default branch; the request fails with: workflow .github/workflows/public-scale.yml not found on the default branch.
  • I verified the exact guarded commands locally on the PR branch with thresholds enabled:
    • FiQA 10k: build 4.9s, search 0.2s, MRR@10 0.425, Hit@10 3/5
    • TREC-COVID 50k: build 28.6s, search 1.3s, MRR@10 0.933, Hit@10 10/10
  • PR CI remains green. After merge, the workflow should become visible for scheduled/manual dispatch.
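
As a rough illustration of the guarded metrics above, the sketch below computes MRR@10 and Hit@10 from ranked results and checks them against thresholds. It is an assumption-laden example, not the benchmark runner's code: the function names, the input shapes, and every threshold value are hypothetical, since this PR does not state the actual thresholds.

```python
def mrr_at_k(rankings: dict, relevant: dict, k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant doc within the top k."""
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked in rankings.items():
        wanted = relevant.get(query_id, set())
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in wanted:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


def hits_at_k(rankings: dict, relevant: dict, k: int = 10) -> int:
    """Number of queries with at least one relevant doc in the top k."""
    return sum(
        1
        for query_id, ranked in rankings.items()
        if any(doc_id in relevant.get(query_id, set()) for doc_id in ranked[:k])
    )


def check_guard(build_s, search_s, mrr, *, max_build_s, max_search_s, min_mrr):
    """Return a list of threshold violations; an empty list means the guard passes."""
    failures = []
    if build_s > max_build_s:
        failures.append(f"build {build_s}s exceeds {max_build_s}s")
    if search_s > max_search_s:
        failures.append(f"search {search_s}s exceeds {max_search_s}s")
    if mrr < min_mrr:
        failures.append(f"MRR@10 {mrr} is below {min_mrr}")
    return failures


# Reported FiQA 10k numbers checked against hypothetical thresholds.
failures = check_guard(
    4.9, 0.2, 0.425, max_build_s=60.0, max_search_s=5.0, min_mrr=0.3
)
print("guard passed" if not failures else failures)
```

A CI job built on this shape would exit non-zero whenever `check_guard` returns a non-empty list.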

@SonAIengine merged commit 7be1337 into main on Jul 2, 2026
2 checks passed
@SonAIengine deleted the codex/public-scale-guard branch on July 2, 2026 01:59